Abstract: This is a report on the work of the Computation Structures Group
of  Project  MAC  toward  the  design  and  specification  of  a  common  base  language 
for  programs  and  information  structures.  We  envision  that  the  meanings  of 
programs  expressed  in  practical  source  languages  will  be  defined  by  rules  of 
translation into the base language. The meanings of programs in the base
language are fixed by rules of interpretation which constitute a transition
system called the interpreter for the base language. We view the base language
interpreter as the functional specification of a computer system in
which  emphasis  is  placed  on  programming  generality  —  the  ability  of  users 
to  build  complex  programs  by  combining  independently  written  program  modules. 

Our  concept  of  a  common  base  language  is  similar  to  the  abstract  programs 
of  the  Vienna  definition  method  —  but  a  single  class  of  abstract  programs  ap¬ 
plies  to  all  source  languages  to  be  encompassed.  The  semantic  constructs  of 
the  base  language  must  be  just  those  fundamental  constructs  necessary  for  the 
effective  realization  of  the  desired  range  of  source  languages.  Thus  we  seek 
simplicity  in  the  design  of  the  interpreter  at  the  expense  of  increased  com¬ 
plexity  of  the  translator  from  a  source  language  to  the  base  language.  As  an 
illustration  of  this  philosophy,  we  present  a  rudimentary  form  of  the  base  lan¬ 
guage  in  which  nonlocal  references  are  not  permitted,  and  show  how  programs  ex¬ 
pressed  in  a  simple  block  structured  language  may  be  translated  into  this  base 
language.

The  importance  of  representing  concurrency  within  and  among  computations 
executed  by  the  interpreter  is  discussed,  and  our  approach  toward  incorporating 
concurrency of action in the base language is outlined.
INTRODUCTION


The  Computation  Structures  Group  of  Project  MAC  is  working  toward  the 
design  and  specification  of  a  base  language  for  programs  and  information 
structures.  The  base  language  is  intended  to  serve  as  a  common  intermediate 
representation  for  programs  expressed  in  a  variety  of  source  programming  lan¬ 
guages. 

The  motivation  for  this  work  is  the  design  of  computer  systems  in  which 
the  creation  of  correct  programs  is  as  convenient  and  easy  as  possible.  A 
major  ingredient  in  the  convenient  synthesis  of  programs  is  the  ability  to 
build  large  programs  by  combining  simpler  procedures  or  program  modules, 
written  independently,  and  perhaps  by  different  individuals  using  different 
source languages. This ability of a computer system to support modular
programming we have called programming generality [3, 4]. Programming
generality requires the communication of data among independently specified
procedures, and thus that the semantics of the languages in which these
procedures are expressed be defined in terms of a common collection of data
types and a common concept of data structure.

We have observed that the achievement of programming generality is very
difficult in conventional computer systems, primarily because of the variety
of data reference and access methods that must be used for the implementation
of large programs with acceptable efficiency. For example, data structures
that vary in size and form during a computation are given different
representations from those that are static; data that reside in different storage
media are accessed by different means of reference; clashes of identifiers
appearing in different blocks or procedures are prevented by design in some
source languages but similar consideration has not been given to the naming
and referencing of cataloged files and procedures in the operating environment
of programs. These limitations on the degree of generality possible in
computer systems of conventional architecture have led us to study new
concepts of computer system organization through which these limitations on
programming generality might be overcome.
In this effort we are working at the same time on developing the
base  language  and  on  concepts  of  computer  architecture  suited  to  the  exe¬ 
cution  of  computations  specified  by  base  language  programs.  That  is,  we 
regard  the  base  language  we  seek  to  define  as  a  specification  of  the  func¬ 
tional  operation  of  a  computer  system.  Thus  our  work  on  the  base  language 
is  strongly  influenced  by  hardware  concepts  derived  from  the  requirements  of 
programming generality [3].

In  particular,  the  choice  of  trees  with  shared  substructures  as  our 
universal  representation  for  information  structures  is  based  in  part  on  a 
conviction  that  there  are  attractive  hardware  realizations  of  memory  systems 
for  tree  structured  data.  For  example,  Gertz  [8]  considers  how  such  a  memory 
system  might  be  designed  as  a  hierarchy  of  associative  memories.  Also,  the 
base  language  is  intended  to  represent  the  concurrency  of  parts  of  computa¬ 
tions  in  a  way  that  permits  their  execution  in  parallel.  One  reason  for  em¬ 
phasizing  concurrency  is  that  it  is  essential  to  the  description  of  certain 
computations  —  in  particular,  when  a  response  is  required  to  whichever  one 
of  several  independent  events  is  first  to  occur.  An  example  is  a  program 
that  must  react  to  the  first  message  received  from  either  of  two  remote 
terminals.  Furthermore,  we  believe  that  exploiting  the  potential  concurrency 
in  programs  will  be  important  in  realizing  efficient  computer  systems  that 
offer  programming  generality.  This  is  because  concurrent  execution  of  pro¬ 
gram  parts  increases  the  utilization  of  processing  hardware  by  providing  many 
activities  that  can  be  carried  forward  while  other  activities  are  blocked 
pending  retrieval  of  information  from  slower  parts  of  the  computer  system 
memory.

Our  proposal  for  the  definition  of  a  common  base  language  may  seem  like 
a  rebirth  of  the  proposal  to  develop  a  Universal  Computer  Oriented  Language 
[24].  Thus  it  is  reasonable  to  inquire  whether  there  is  any  better  chance 
that  the  development  suggested  here  will  succeed  whereas  this  earlier  work 
did  not  result  in  a  useful  contribution  to  the  art.  Our  confidence  in 
eventual  success  rests  on  important  trends  in  the  computer  field  during  the 
past  ten  years  and  fundamental  differences  in  philosophy.  The  most  important 
change  is  the  increased  importance  of  achieving  greater  programming  gener¬ 
ality  in  future  computer  systems.  The  cost  of  acquiring  and  operating  the 
hardware  portion  of  computer  systems  has  become  dominated  by  the  expense 
of creating and maintaining the system and application software. At present,
there  is  great  interest  in  the  exchange  of  programs  and  data  among  computer 
installations,  and  in  building  complex  procedures  from  components  through 
the  facilities  of  time-shared  computers.  Computer  users  are  often  pre¬ 
pared  to  forsake  efficiency  of  programs  to  gain  the  ability  to  operate 
them in different environments, and the ability to use the program in
conjunction  with  other  programs  to  accomplish  a  desired  objective. 

Furthermore,  the  pace  of  programming  language  evolution  has  slowed.  It 
is  rare  that  a  fundamentally  new  concept  for  representing  algorithms  is  in¬ 
troduced.  Workers  on  programming  language  design  have  turned  to  refining 
the  conceptual  basis  of  program  representation,  providing  more  natural  modes 
of  expressing  algorithms  in  different  fields,  and  consolidating  diverse  ways 
of  representing  similar  actions.  Today,  there  is  good  reason  to  expect  that 
a  basic  set  of  notions  about  data  and  control  structures  will  be  sufficient 
to  encompass  a  usefully  large  class  of  practical  programming  languages  and 
applications.  In  particular,  the  set  of  elementary  data  types  used  in  com¬ 
putation  has  not  changed  significantly  since  the  first  years  of  the  stored 
program  computer  —  they  are  the  integers,  representations  for  real  numbers, 
the truth values true and false, strings of bits, and strings of symbols from
an  alphabet.  Also,  considerable  attention  is  currently  devoted  to  the  de¬ 
velopment  of  useful  abstract  models  for  information  structures,  and  the  pros¬ 
pects  are  good  that  these  efforts  will  converge  on  a  satisfactory  general 
model.

We  are  also  encouraged  by  others  who  are  striving  toward  similar  goals. 
Andrei  Ershov  is  directing  a  group  at  the  Novosibirsk  Computing  Center  of  the 
Soviet  Union  in  the  development  of  a  common  "internal  language"  for  use  in 
an  optimizing  compiler  for  three  different  languages  —  PL/I,  Algol  68,  and 
Simula 67 [7]. The internal language would be a representation common to
the  three  source  languages  and  is  to  serve  as  the  representation  in  which 
transformations  are  performed  for  machine  independent  optimization. 

The  "contour  model"  for  program  execution,  as  explained  by  Johnston  [10] 
and  Berry  [1]  provides  a  readily  understood  vehicle  for  explaining  the 
semantics of programming languages such as Algol 60, PL/I, and Algol 68
in  which  programs  have  a  nested  block  structure.  It  is  easy  to  imagine 
how  the  contour  model  could  be  formalized  and  thus  serve  as  a  basis  for 
specifying  the  formal  semantics  of  programming  languages.  The  contour 
model  may  be  considered  as  a  proposal  for  a  common  base  language  and  as  a 
guide  for  the  design  of  computer  systems  that  implement  block  structured 
languages.

John  Iliffe  has  for  some  time  recognized  some  of  the  fundamental  im¬ 
plications  of  programming  generality  with  respect  to  computer  organization. 
His  book  Basic  Machine  Principles  [9]  is  a  good  exposition  of  his  ideas 
which  are  argued  from  the  limitations  of  conventional  computer  hardware  in 
executing  general  algorithms.  Again,  Iliffe's  machine  defines  a  scheme  of 
program  representation  that  could  be  thought  of  as  a  common  base  language. 
However,  Iliffe  has  not  discussed  his  ideas  from  this  viewpoint. 


FORMAL  SEMANTICS 

When  the  meaning  of  algorithms  expressed  in  some  programming  language 
has  been  specified  in  precise  terms,  we  say  that  a  formal  semantics  for  the 
language has been given. A formal semantics for a programming language
generally takes the form of two sets of rules, one set being a translator
and the second set being an interpreter. The translator specifies a
transformation of any well formed program expressed in the source language (the
concrete  language)  into  an  equivalent  program  expressed  in  a  second 
language  —  the  abstract  language  of  the  definition.  The  interpreter  ex¬ 
presses  the  meaning  of  programs  in  the  abstract  language  by  giving  explicit 
directions  for  carrying  out  the  computation  of  any  well  formed  abstract  pro¬ 
gram  as  a  countable  set  of  primitive  steps. 

It  would  be  possible  to  specify  the  formal  semantics  of  a  programming 
language  by  giving  an  interpreter  for  the  concrete  programs  of  the  source 
language.  The  translator  is  then  the  identity  transformation.  Yet  the  in¬ 
clusion  of  a  translator  in  the  definition  scheme  has  important  advantages. 
For  one,  the  phrase  structure  of  a  programming  language  viewed  as  a  set  of 
strings  on  some  alphabet  usually  does  not  correspond  well  with  the  semantic 
structure of programs. Thus it is desirable to give the semantic rules
of  interpretation  for  a  representation  of  the  program  that  more  naturally 
represents  its  semantic  structure.  Furthermore,  many  constructs  present 
in  source  languages  are  provided  for  convenience  rather  than  as  fundamental 
linguistic  features.  By  arranging  the  translator  to  replace  occurrences  of 
these  constructs  with  more  basic  constructs,  a  simpler  abstract  language  is 
possible,  and  its  interpreter  can  be  made  more  readily  understandable  and 
therefore  more  useful  as  a  tool  for  the  design  and  specification  of  computer 
languages  and  systems. 

The  abstract  language  that  has  received  the  most  attention  as  a  base 
for the formal semantics of programming languages is the lambda calculus of
Church.  For  several  reasons  we  have  found  the  lambda  calculus  unsuited  to 
our  work.  The  most  serious  problem  is  that  the  lambda  calculus  does  not 
deal  directly  with  structured  data.  Thus  it  is  inconvenient  to  use  the 
lambda  calculus  as  a  common  target  language  for  programs  that  make  use  of 
selection  to  reference  components  of  information  structures.  It  also  rules 
out  modeling  of  sharing  in  the  form  of  two  or  more  structures  having  the  same 
substructure  as  a  component. 

A  second  defect  in  terms  of  our  goals  is  that  the  lambda  calculus  in¬ 
corporates  the  concept  of  free  and  bound  variables  characteristic  of  block 
structured programming languages. We prefer to exclude these concepts so
the  base  language  and  its  interpreter  are  simpler  and  more  readily  applied 
to  the  study  of  computer  organization.  Later  in  the  paper  we  show  how  block 
structured  programs  may  be  translated  into  base  language  programs  using  the 
rudimentary  version  of  the  base  language  introduced  below.  This  translation 
of  block  structured  programs  into  programs  that  are  not  block  structured  is 
an  important  example  of  how  simplicity  in  the  interpreter  may  be  obtained 
by  translating  source  language  constructs  into  more  primitive  constructs. 

Our  thoughts  on  the  definition  of  programming  languages  in  terms  of  a 
base  language  are  closely  related  to  the  formal  methods  developed  at  the  IBM 
Vienna  Laboratory  [17,  18],  and  which  derive  from  the  ideas  of  McCarthy  [19,  20] 
and  Landin  [13,  14].  For  the  formal  semantics  of  programming  languages  a  gen¬ 
eral  model  is  required  for  the  data  on  which  programs  act.  We  regard  data  as 
consisting of elementary objects and compound objects formed by combining
elementary objects into data structures.


Elementary  objects  are  data  items  whose  structure  in  terms  of  simpler 
objects is not relevant to the description of algorithms. For the purposes
of  this  paper,  the  class  E  of  elementary  objects  is 

E = Z ∪ R ∪ W

where

Z = the class of integers
R = a set of representations for real numbers
W = the set of all strings on some alphabet

Data  structures  are  often  represented  by  directed  graphs  in  which 
elementary  objects  are  associated  with  nodes,  and  each  arc  is  labelled  by 
a member of a set S of selectors. In the class of objects used by the Vienna
group,  the  graphs  are  restricted  to  be  trees,  and  elementary  objects  are  as¬ 
sociated  only  with  leaf  nodes.  We  prefer  a  less  restricted  class  so  an  ob¬ 
ject  may  have  distinct  component  objects  that  share  some  third  object  as  a 
common  component.  The  reader  will  see  that  this  possibility  of  sharing  is 
essential  to  the  formulation  of  the  base  language  and  interpreter  presented 
here.  Our  class  of  objects  is  defined  as  follows: 

Let E be a class of elementary objects, and let S be a class of
selectors. An object is a directed acyclic graph having a single
root node from which all other nodes may be reached over directed
paths. Each arc is labelled with one selector in S, and an elementary
object in E may be associated with each leaf node.


S = Z ∪ W

Figure 1 gives an example of an object. Leaf nodes having associated
elementary objects are represented by circles with the element of E written
inside; integers are represented by numerals, strings are enclosed in single
quotes, and reals have decimal points. Other nodes are represented by
solid dots, with a horizontal bar if there is more than one emanating arc.

Figure 2. Language definition by the Vienna method.

The  node  of  an  object  reached  by  traversing  an  arc  emanating  from  its 
root  node  is  itself  the  root  node  of  an  object  called  a  component  of  the 
original  object.  The  component  object  consists  of  all  nodes  and  arcs  that 
can  be  reached  by  directed  paths  from  its  root  node. 
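
As an informal illustration of this class of objects, the following Python sketch models a node as a table of selector-labelled arcs plus an optional elementary value. The names Node, component, and reachable are hypothetical and are not part of the base language; the sketch only shows how two distinct components may share a third object as a common component.

```python
class Node:
    """A node of an object: labelled arcs to components, or a value at a leaf."""
    def __init__(self, value=None):
        self.arcs = {}      # selector (integer or string) -> Node
        self.value = value  # elementary object, meaningful only at a leaf


def component(obj, *selectors):
    """Follow a path of selectors starting at the root node of obj."""
    for s in selectors:
        obj = obj.arcs[s]
    return obj


def reachable(root):
    """Identities of all nodes reachable over directed paths from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.arcs.values())
    return seen


# Two distinct components ('a' and 'b') that share a third object.
shared = Node(1.5)
left, right, root = Node(), Node(), Node()
left.arcs['x'] = shared
right.arcs['y'] = shared
root.arcs['a'] = left
root.arcs['b'] = right
```

Here the 'a'·'x'-component and the 'b'·'y'-component of the root are the same node, which a tree-only model could not express.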

At  present,  we  rule  out  directed  cycles  in  the  graphs  of  objects  for 
several  reasons:  In  the  first  place,  the  data  structures  of  the  most  im¬ 
portant  source  languages  are  readily  modelled  as  objects  according  to  our 
definition.  Also,  it  seems  that  realizing  the  maximal  concurrency  of  com¬ 
putations  on  data  structures  will  be  difficult  to  do  with  a  guarantee  of 
determinism  if  objects  are  permitted  to  contain  cycles.  Finally,  the  pos¬ 
sibility  of  cycles  invalidates  the  reference  count  technique  of  freeing 
storage  for  data  items  no  longer  accessible  to  computations,  and  some  more 
general  garbage  collection  scheme  must  be  used.  The  general  techniques  do 
not  seem  attractive  with  regard  to  the  concepts  of  computer  organization  we 
have  been  studying  —  especially  when  data  items  are  distributed  among  sev¬ 
eral  physical  levels  of  memory. 
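
The remark about reference counts can be made concrete with a small sketch. It assumes a hypothetical encoding in which each node name maps to a list of its components and to a count of incoming arcs; release is an illustrative helper, not a mechanism described in this paper.

```python
def release(node, refcount, arcs):
    """Drop one reference to node; free it and release its components at zero."""
    refcount[node] -= 1
    freed = []
    if refcount[node] == 0:
        freed.append(node)
        for child in arcs[node]:
            freed += release(child, refcount, arcs)
    return freed


# Acyclic case: an external arc points to 'a', and 'a' points to 'b'.
arcs = {'a': ['b'], 'b': []}
refcount = {'a': 1, 'b': 1}
acyclic_freed = release('a', refcount, arcs)

# Cyclic case: an external arc points to 'a', 'a' points to 'b', 'b' back to 'a'.
arcs = {'a': ['b'], 'b': ['a']}
refcount = {'a': 2, 'b': 1}
cyclic_freed = release('a', refcount, arcs)
```

In the acyclic case both nodes are freed once the external arc is erased. In the cyclic case the count of 'a' never reaches zero, so neither node is freed although neither is accessible any longer.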

It is convenient to introduce our concept of a base language and its
interpreter  by  comparison  with  the  Vienna  definition  method  as  represented 
by  the  formal  definitions  of  Algol  60  [15]  and  PL/I  [18].  The  Vienna  method 
is  outlined  in  Figure  2.  The  concrete  programs  of  the  programming  language 
being  defined  are  mapped  into  abstract  programs  by  the  translator.  A  con¬ 
crete  program  is  a  string  of  symbols  that  satisfies  a  concrete  syntax  usually 
expressed  as  a  form  of  context  free  grammar.  The  interpreter  is  a  nondeter- 
ministic  state  transition  system  defined  by  a  relation  that  specifies  all 
possible  next  states  for  any  state  of  the  interpreter.  Abstract  programs 
and  the  states  of  the  interpreter  are  represented  by  objects  (trees). 

Figure 2 shows the three major components of interpreter states. The
'text'-component is the abstract program being interpreted. The
'mem'-component is an object that contains the values of variables in the abstract
program, thus serving as a model of memory. The 'cont'-component of the
state contains information about statements of the abstract program
whose execution is in progress. The interpreter is specified as a
nondeterministic system so activities may be carried out concurrently where
permitted by the language being defined.

For  comparison,  note  that  a  separate  class  of  abstract  programs  and 
interpreter are specified for each formal definition of a source language;
that  states  of  the  interpreter  model  only  the  information  structures  re¬ 
lated  to  execution  of  one  abstract  program;  and  that  statements  in  the  con¬ 
crete  program  retain  their  identity  as  distinct  parts  of  the  corresponding 
abstract  program. 

Figure  3  is  the  corresponding  outline  showing  how  source  languages 
would  be  defined  in  terms  of  a  common  base  language.  A  single  class  of 
abstract  programs  constitutes  the  base  language.  Concrete  programs  in 
source languages (L1 and L2 in the figure) are defined by translators into
the  base  language  —  the  class  of  abstract  programs  serves  as  the  common 
target  representation  for  several  source  languages.  For  this  to  be  effec¬ 
tively  possible,  the  base  language  should  be  the  "least  common  denominator" 
of  the  set  of  source  languages  to  be  accommodated.  The  structure  of  abstract 
programs  cannot  reflect  the  peculiarities  of  any  particular  source  language, 
but  must  provide  a  set  of  fundamental  linguistic  constructs  in  terms  of  which 
the  features  of  these  source  languages  may  be  realized.  The  translators 
themselves  should  be  specified  in  terms  of  the  base  language,  probably  by 
means  of  a  specialized  source  language.  Formally,  abstract  programs  in  the 
base  language,  and  states  of  the  interpreter  are  elements  of  our  class  of 
objects  defined  above. 

The structure of states of the interpreter for the base language is
shown in Figure 4. Since we regard the interpreter for the base language
as a complete specification for the functional operation of a computer
system, a state of the interpreter represents the totality of programs, data,
and control information present in a computer system. In Figure 4 the
universe is an object that represents all information present in the computer
system when the system is idle, that is, when no computation is in progress.
The universe has data structures and procedure structures as constituent
objects. Any object is a legitimate data structure; for example, a data
structure may have components that are procedure structures. A procedure
structure is an object that represents a procedure expressed in the base
language. It has components which are instructions of the base language,
data structures, or other procedure structures. So that multiple
activations of procedures may be accommodated, a procedure structure remains
unaltered during its interpretation.

Figure 3. Language definition in terms of a common base language.

Figure 4. Structure of objects representing states of the base language
interpreter.

The local structure of an interpreter state contains a local structure
for  each  current  activation  of  each  base  language  procedure.  Each  local 
structure  has  as  components  the  local  structures  of  all  procedure  activa¬ 
tions  initiated  within  it.  Thus  the  hierarchy  of  local  structures  represents 
the  dynamic  relationship  of  procedure  activations.  One  may  think  of  the 
root  local  structure  as  the  nucleus  of  an  operating  system  that  initiates 
independent,  concurrent  computations  on  behalf  of  system  users  as  they  re¬ 
quest  activation  of  procedures  from  the  system  files  (the  universe). 

The local structure of a procedure activation has a component object
for each variable of the base language procedure. The selector of each
component is its identifier in the instructions of the procedure. These
objects may be elementary or compound objects and may be common with objects
within the universe or within local structures of other procedure activations.

The control component of an interpreter state is an unordered set of
sites of activity. A typical site of activity is represented in Figure 4
by an asterisk at an instruction of procedure P and an arrow to the local
structure L for some activation of P. This is analogous to the "instruction
pointer/environment pointer" combination that represents a site of activity
in Johnston's contour model [10]. Since several activations of a
procedure may exist concurrently, there may be two or more sites of activity
involving the same instruction of some procedure, but designating different
local structures. Also, within one activation of a procedure, several
instructions may be active concurrently; thus asterisks on different
instructions of a procedure may have arrows to the same local structure.

Each state transition of the interpreter executes one instruction for
some procedure activation, at a site of activity selected arbitrarily from
the control of the current state. Thus the interpreter is a nondeterministic
transition system. In the state resulting from a transition, the
chosen site of activity is replaced according to the sequencing rules of
the base language. Replacement with two sites of activity designating two
successor instructions would occur in interpretation of a fork instruction;
deletion of the site of activity without replacement would occur in execution
of a quit or join instruction.
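
The transition rule just described can be sketched as follows. This is a minimal illustration under stated assumptions: a site of activity is a pair (instruction selector, local structure name), the program is a table from integer selectors to tuples, and the operation name 'work' stands for any ordinary instruction. None of these encodings are taken from the paper, and join is omitted.

```python
import random


def step(control, program, rng):
    """One state transition: choose any site of activity and replace it."""
    pc, local = control.pop(rng.randrange(len(control)))
    op, *targets = program[pc]
    if op == 'fork':
        # Replaced by two sites designating the two successor instructions.
        control.extend((t, local) for t in targets)
    elif op == 'quit':
        # Deleted without replacement.
        pass
    else:
        # Ordinary instruction: advance sequentially.
        control.append((pc + 1, local))


program = {
    0: ('fork', 1, 3),
    1: ('work',), 2: ('quit',),
    3: ('work',), 4: ('quit',),
}
control = [(0, 'L')]
rng = random.Random(0)
steps = 0
while control:
    step(control, program, rng)
    steps += 1
```

Whatever order the random choice produces, this particular program takes five transitions before the control becomes empty; only the interleaving of the two branches varies.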


INTERPRETATION  OF  A  RUDIMENTARY  BASE  LANGUAGE 

Next we show how typical instructions of a rudimentary base language
would be implemented by state transitions of an interpreter. This will put
the concepts expressed above into more concrete form, and provide a basis
for understanding the translation of block structured languages into the
base language. Because consideration of concurrency in programs has led
to concepts of program representation unfamiliar to most readers, and
because these concepts are not sufficiently advanced, we will use for
illustration a base language employing conventional instruction sequencing. The
instructions of a procedure are objects selected by successive integers,
with 0 being the selector of the initial instruction.

The effect of representative instructions on the interpreter state is
shown in Figures 5 through 11 in the form of before/after pictures of relevant
state components. In these figures, P marks the root of the procedure
structure containing an instruction under consideration as its i-component,
and L(P) is the root of the local structure for the relevant activation of
P.

The add instruction is typical of instructions that apply binary
operations to elementary objects. The instruction

add 'u', 'v', 'w'

is an object having as components the four elementary objects 'add', 'u',
'v', and 'w'. These are interpreted as an operation code and three "address
fields" used as selectors for operands and result in the local structure
L(P). The state transition is shown in Figure 5. Note that the site of
activity advances sequentially to the (i+1)-component of P.
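
A possible rendering of this transition in Python is sketched below. It assumes that the four fields of an instruction are selected by the integers 0 through 3 and that the result is written into the 'w' node if one already exists; both choices, and the names Node, instruction, and execute_add, are assumptions made for illustration.

```python
class Node:
    """A node of an object: labelled arcs to components, or a value at a leaf."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def instruction(*fields):
    """Build an instruction object whose components are elementary objects."""
    node = Node()
    for k, field in enumerate(fields):
        node.arcs[k] = Node(field)
    return node


def execute_add(P, L, i):
    """Interpret the add instruction that is the i-component of P."""
    op, u, v, w = (P.arcs[i].arcs[k].value for k in range(4))
    if op != 'add':
        raise ValueError('not an add instruction')
    result = L.arcs[u].value + L.arcs[v].value
    # Store the result as the w-component of the local structure.
    L.arcs.setdefault(w, Node()).value = result
    return i + 1  # the site of activity advances sequentially


P = Node()
P.arcs[0] = instruction('add', 'u', 'v', 'w')
L = Node()
L.arcs['u'] = Node(2)
L.arcs['v'] = Node(3)
next_i = execute_add(P, L, 0)
```

The procedure structure P is only read, never modified, in keeping with the requirement that a procedure structure remain unaltered during its interpretation.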

Let us say that a procedure activation has direct access to a data
structure if the data structure is the s-component of the local structure
for some selector s. The instruction

select 'p', 'n', 'q'

is used to gain direct access to the 'n'-component of a data structure to
which direct access exists. This instruction makes the object that is the
'p'·'n'-component of L(P) also the 'q'-component of L(P) as shown by Figure 6.
Literal values are retrieved from the procedure structure by const
instructions such as

const 1.5, 'x'

which makes the elementary object 1.5 the 'x'-component of L(P). Select and
const instructions may be used to build arbitrary data structures as
illustrated in Figure 7. Note that execution of select 'p', 'n', 'q' implies
creation of an 'n'-component of the object selected by 'p' if none already
exists.
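
The structure-building use of these two instructions might be sketched as follows. The sketch rests on an assumption that the text does not settle: const is taken to store its value in the existing 'x' node when there is one, so that the value becomes visible through every arc that shares that node. The function names are hypothetical.

```python
class Node:
    """A node of an object: labelled arcs to components, or a value at a leaf."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def select(L, p, n, q):
    """Make the p.n-component of L also the q-component of L."""
    parent = L.arcs[p]
    if n not in parent.arcs:
        parent.arcs[n] = Node()  # created if none already exists
    L.arcs[q] = parent.arcs[n]


def const(L, value, x):
    """Make the elementary object `value` the x-component of L."""
    # Assumption: reuse the existing node so that sharing is preserved.
    L.arcs.setdefault(x, Node()).value = value


# Build a structure whose 'n'-component holds the literal 1.5.
L = Node()
L.arcs['p'] = Node()
select(L, 'p', 'n', 'q')
const(L, 1.5, 'q')
```

After the two steps, the literal is reachable both directly as the 'q'-component and as the 'p'·'n'-component of the local structure.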

Figure 8 shows how the instruction

link 'p', 'n', 'q'

establishes an arc between two objects (the 'p'- and 'q'-components of L(P))
to which direct access exists. Execution of this instruction makes the
'q'-component of L(P) also the 'p'·'n'-component of L(P).

Figure 5. Interpretation of an instruction specifying a binary operation.

Figure 6. Interpretation of a select instruction.

Figure 7. Structure building using select and const instructions.

The link instruction is the means for establishing sharing, that is, making
one object a common component of two distinct objects. Unless some
restriction is built into the base language or its interpreter, use of
link instructions can introduce cycles into the interpreter state. At
present we do not know how use of link instructions should be limited so
introduction of cycles cannot occur. One way in which cycles can arise
occurs in the interpretation of block structured programs by the scheme
given in the next section of the paper.
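
To illustrate how a link instruction can establish sharing and also introduce a cycle, here is a brief sketch. The helper has_cycle is a standard depth-first search added for illustration; it is not a restriction proposed by the paper, and the selector 'm' is an arbitrary choice.

```python
class Node:
    """A node of an object: labelled arcs to components, or a value at a leaf."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def link(L, p, n, q):
    """Make the q-component of L also the p.n-component of L."""
    L.arcs[p].arcs[n] = L.arcs[q]


def has_cycle(node, on_path=None, done=None):
    """Depth-first search for a directed cycle reachable from node."""
    on_path = set() if on_path is None else on_path
    done = set() if done is None else done
    if id(node) in on_path:
        return True
    if id(node) in done:
        return False
    on_path.add(id(node))
    found = any(has_cycle(c, on_path, done) for c in node.arcs.values())
    on_path.discard(id(node))
    done.add(id(node))
    return found


L = Node()
L.arcs['p'] = Node()
L.arcs['q'] = Node()
link(L, 'p', 'n', 'q')      # sharing: q is now a component of p as well
before = has_cycle(L)
link(L, 'q', 'm', 'p')      # p becomes a component of its own component
after = has_cycle(L)
```

The first link only creates a shared component; the second closes a directed cycle between the 'p'- and 'q'-components.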

The instruction

delete 'p', 'n'

erases the arc labelled 'n' emanating from the root of the 'p'-component
of L(P). Any nodes and arcs that are unrooted after the erasure cease to
be part of the interpreter state, as shown in Figure 9.
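
The effect of the erasure can be sketched by comparing the set of nodes reachable from the local structure before and after. The names below are hypothetical, and the example treats L(P) as the only root, which is a simplification of the full interpreter state.

```python
class Node:
    """A node of an object: labelled arcs to components, or a value at a leaf."""
    def __init__(self, value=None):
        self.arcs = {}
        self.value = value


def delete(L, p, n):
    """Erase the arc labelled n emanating from the root of the p-component."""
    del L.arcs[p].arcs[n]


def reachable(root):
    """Identities of all nodes still rooted at root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.arcs.values())
    return seen


L, p, X, S = Node(), Node(), Node(), Node('shared')
L.arcs['p'] = p
p.arcs['n'] = X
X.arcs['k'] = S      # S is a component of X ...
L.arcs['q'] = S      # ... and is also directly accessible as 'q'
before = reachable(L)
delete(L, 'p', 'n')
after = reachable(L)
```

Node X becomes unrooted and drops out of the state, while the shared node S survives because another arc still leads to it.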

Activation of a new procedure is accomplished by the instruction

apply 'f', 'a'

where the 'f'-component of L(P) is the procedure structure F of the
procedure to be activated, and the 'a'-component of L(P) is an object (an
argument structure) that contains as components all data required by the
procedure (e.g., actual parameter values) to perform its function. Execution
of the apply instruction causes the state transition illustrated in
Figure 10: A root node L(F) is created for the local structure of the new
activation; the argument structure is made the 'a'-component of L(F); a new
site of activity is denoted by an asterisk on the 0-component of F and an
arrow to L(F); and the original site of activity is advanced to the
(i+1)-instruction of P and made dormant as indicated by the parentheses.

A  procedure  activation  I*  terminated  by  the  Instruction 


retuyn 

which  cause*  the  *tate  transition  displayed  In  Figure  11.  The  root  node 
Iff)  Is  erased,  deleting  all  parts  of  the  local  structure  of  T  that  are  not 
linked  to  the  argument  structure;  the  site  of  activity  at  the  rgj^ugn  In¬ 
struction  disappears;  and  the  dormant  site  of  activity  In  the  activating 
procedure  Is  activated.  Jiote  that  the  entire  effect  of  executing  procedure 
F  Is  conveyed  to  the  activation  of  P  by  way  of  the  argument  structure. 
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Plgure  9.  Hip  effect  of  executing  a  do  lete  In* cruet l on. 


structure 


Figure  10.  Initiation  of  a  procedure  activation 
by  an  aggj^  instruction. 


structure 


structure  structure 


Figure  11.  Termination  of  a  procedure  activation  by 
a  return  instruction. 


-23- 
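
The apply and return transitions can be mimicked in a few lines of Python.
This is a minimal sketch under stated assumptions, not the interpreter
itself: Activation, apply, and ret are hypothetical names, a site of
activity is reduced to an instruction counter plus a dormant flag, and the
local structure is a plain dictionary whose "A" entry is the argument
structure.

```python
class Activation:
    """One procedure activation: a local structure plus a site of activity."""

    def __init__(self, procedure, argument, caller=None):
        self.procedure = procedure      # name of the procedure structure
        self.local = {"A": argument}    # L(F); the argument is its A-component
        self.caller = caller            # activation made dormant by apply
        self.pc = 0                     # new site starts at instruction 0
        self.dormant = False


def apply(current, procedure, argument):
    """Advance and suspend the caller, then start a new activation."""
    current.pc += 1
    current.dormant = True
    return Activation(procedure, argument, caller=current)


def ret(current):
    """Erase the local structure and wake the dormant caller."""
    caller = current.caller
    current.local = None                # L(F) is erased
    caller.dormant = False
    return caller


P = Activation("P", argument={})
arg = {"1": 3}                          # one actual parameter
F = apply(P, "F", arg)
# F leaves its result in the argument structure, the only channel to P.
F.local["A"]["R"] = F.local["A"]["1"] ** 2
resumed = ret(F)
```

After ret, the only trace of F's activation is the R entry that F placed
in the argument structure, which the caller still holds.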


To apply a procedure, its procedure structure must be a component
of the local structure of the current procedure activation. If the
procedure to be activated is the 'g'-component of the procedure structure P
in execution, execution of the instruction

move 'g', 'f'

will make it directly accessible by identifying the 'f'-component of L(P)
with the 'g'-component of P.

TRANSLATION  OF  BLOCK  STRUCTURED  LANGUAGES 

Many important programming languages for practical computation are
block structured; the texts of blocks and procedures are nested, and
identifiers in one text may refer to variables defined in other texts.
Since we do not plan to include in the base language provision for directly
representing references by a procedure to external objects, we must show
how the execution of block structured programs may be simulated through
translation into the base language and execution by the base language
interpreter. The following discussion gives one way in which this may be
accomplished, a way that seems attractive in relation to the concepts of
computer organization we are investigating. This discussion also serves as
a good example of how complexity in a source language may be represented in
the rules of translation rather than in the rules of interpretation of a
formal definition.

For this discussion we will use an elementary block structured language.
Identifiers are declared by the lines

integer x    or    proced x

to denote simple variables or procedures. Basic statement types include
assignment statements such as

x := g(u, v)

where x, u, and v are simple variable identifiers, and g denotes an
unspecified function; procedure applications of the form

apply f(x, y)    or    z := apply f(x, y)

where f is a procedure identifier, the second form being used for a value
returning procedure; conditional statements like

if p(x) then S1 else S2

and iteration statements like

while p(x) do S1

where p denotes an unspecified predicate and S1 and S2 are basic statements
or sequences of statements delimited by begin, end.

A procedure variable f may be assigned a value by a declaration
statement having the form

f := procedure (x, ..., y)
begin
    ...
end

where x, ..., y are the formal parameters. A statement

return z

specifies the result of a value returning procedure. The lines between begin
and end, together with the list of formal parameters, make up the text of
the procedure.

A program in this language has the form of a nested set of procedure
declarations. Except for the text of the outermost declaration, each text
is enclosed by the text within which its declaration appears. As in Algol
60, each identifier is local to the text in which it is declared, and the
meaning of a nonlocal appearance of an identifier is defined to be the same
as its meaning in the enclosing text. The formal parameters of a procedure
are local identifiers of the text being declared.

The meaning of block structured programs can be expressed in terms
of a tree of symbol tables as has been explained by Weizenbaum [26], or
in terms of the contour model. The interested reader should study the
work of Berry [2] and Lucas [16] for other discussions of formal
implementations of block structured programs and their equivalence.

To simulate the execution of a block structured program by a base
language program, we need a scheme for implementing the nonlocal
references of the source program. Our method is to augment the argument
structure associated with a procedure activation in the base language
interpreter so that all external objects to which reference is required by
the block structured procedure are accessed as components of the argument
structure.

To make matters precise, it is convenient to adopt some notation.
Suppose T is the text of a procedure declaration. We write B(T) to denote
the set of identifiers declared within T (local to T). The set X(T) of
external identifiers associated with text T is defined as follows: We write
T' < T if text T' is nested within text T, that is, if there is a sequence
of texts T₀, T₁, ..., Tₖ such that T = T₀, T' = Tₖ, and Tᵢ encloses Tᵢ₊₁ for
i = 0, ..., k-1. Then X(T) contains each identifier x that has a nonlocal
appearance in some text T', T' ≤ T, and is not local to any text T'',
T' < T'' ≤ T.
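
The definition of X(T) amounts to a simple recursion over the nesting of
texts. The following Python is a sketch under assumptions: the Text record
and its fields are hypothetical, "used" lists the identifiers that appear
directly in a text (not in texts it encloses), and the two example texts
are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    name: str
    declared: set                     # B(T): identifiers local to T
    used: set                         # identifiers appearing directly in T
    enclosed: list = field(default_factory=list)


def external(text):
    """X(T): identifiers appearing in T or a nested text that are not
    declared in any text on the nesting path up to and including T."""
    result = set(text.used)
    for inner in text.enclosed:
        result |= external(inner)     # externals of nested texts rise upward
    return result - text.declared     # unless T itself declares them


inner = Text("inner", declared={"b"}, used={"a", "b", "c"})
outer = Text("outer", declared={"a", "f"}, used={"a", "f"}, enclosed=[inner])

x_inner = external(inner)
x_outer = external(outer)
```

Here a is external to the inner text but is captured by the outer
declaration, whereas c is declared nowhere and so remains external to both.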

In these terms we can describe the formats of the local structures and
argument structures to be used in simulation of block structure in the base
language. Corresponding to the activation record for an activation of
procedure text T, a local structure (L-structure) is formed by the base
language program. The L-structure has the format shown in Figure 12a. It
has an E-component in which a value is associated with each identifier in
B(T) ∪ X(T), that is, each local and each external identifier of T. The
L-structure also includes components for temporary values required by the
base language instructions that interpret the text T.

The argument structure (A-structure) for an activation of procedure
text T will have one component for each formal parameter of the text T, and
in addition, an E-component that conveys access to objects referenced by
the external identifiers of T, as shown in Figure 12b.

A procedure identifier is given a value by a procedure declaration
statement including a text T. Because procedure values may be assigned
to nonlocal identifiers, and may be passed to the calling activation by a
value returning procedure, activations of the text T may occur in situations
where there is no clear meaning for the external identifiers of T. The usual
solution to this problem is to let a procedure value be an object called a
closure of the text T (a C-structure) having two components as in Figure
12c. The T-component of a closure is the text itself. The E-component
(environment) includes an x-component for each x in X(T), and gives an
activation of the text access to objects referenced by its external
identifiers.

Usually, the meaning of the external identifiers of a closure of T is
fixed at the time the closure is created by execution of the declaration of
T. Each x ∈ X(T) is given the same meaning as the current meaning of x in
the text T' that encloses the declaration statement.

Figure 12. Formats of local, argument, and closure structures for the
interpretation of block structured programs: (a) L-structure;
(b) A-structure; (c) C-structure.

The way in which block structured programs may be simulated by the base
language interpreter is best introduced by an example. The following
program is adapted from Weizenbaum's paper [26]:

program 1:

p := procedure
begin proced f, q, r; integer u, v, z
    f := procedure (x); integer x
    begin proced g
        g := procedure (y); integer y
        begin integer t
            t := (x + y) ↑ 2
            return t
        end
        return g
    end
    q := apply f(1)
    r := apply f(2)
    u := apply q(3)
    v := apply r(5)
    z := u + v
end

The program consists of three procedure texts P, F and G having local and
external identifiers as follows:

B(P) = {f, q, r, u, v, z}    B(F) = {x, g}    B(G) = {t, y}

X(P) = ∅                     X(F) = ∅         X(G) = {x}
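
For readers used to languages that provide lexical closures directly,
program 1 corresponds roughly to the Python below. This is an analogy
rather than part of the base language: Python's built-in closure mechanism
plays the part of the E-component, and ** is assumed to match the
exponentiation operator in the body of g.

```python
def f(x):
    def g(y):
        t = (x + y) ** 2      # x is an external identifier of g
        return t
    return g                  # the value returned is a closure of g


q = f(1)
r = f(2)
u = q(3)
v = r(5)
z = u + v

# The cell captured by each closure holds the x that was current when
# the declaration of g was executed.
captured_by_q = q.__closure__[0].cell_contents
captured_by_r = r.__closure__[0].cell_contents
```

The two closures share the text of g but carry different environments,
which is why q and r give different results for the same kind of call.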

Following Weizenbaum and Johnston, we display the progress of a
computation by giving a series of snapshots of the interpreter state, chosen
to illustrate points about the execution mechanism. For procedure P, the
initial state of the interpreter (Snapshot 1, Figure 13) includes the text
of P in the form of a procedure structure. This procedure structure is in
fact a tree of procedure structures; for each text T ≤ P, the procedure
structure for T has as a component a procedure structure for each text
enclosed by T. We will not describe further the coding of procedure texts as
sets of instructions, as the required instruction sequences will be clear
from the discussion of the state transitions seen in the series of
snapshots. The initial state also includes a local structure L(P) that will
serve as the activation record for procedure P; it is empty except for the
argument structure A(P), which consists of an empty E-component.

For clarity, the arcs that make each argument structure a component of
the local structures of the calling and called procedures are omitted from
the snapshots. Also, we will not include the procedure structure for P in
subsequent snapshots, its presence being understood throughout the
computation.

The first step performed by instructions of the base language
representation of P is to create an E-component of its L-structure, and an
E·'x'-component for each identifier x in B(P) ∪ X(P) = {f, q, r, u, v, z}.
Execution of the declaration of text F yields snapshot 2. The
E·'f'·C-component of L(P) is now a closure of F represented by a
C-structure. Its T-component is the text of F and is shared with the text
of P; its E-component is empty because X(F) = ∅.

The first step in the execution of

q := apply f(1)

is to form an appropriate argument structure A(F1). Its 1-component is the
actual parameter value, and its E-component is empty, again because
X(F) = ∅. Activation of the procedure structure for F creates an
L-structure L(F1). The first action by instructions of F is to associate
actual parameters with identifiers in B(F). Thus the E·'x'-component of
L(F1) is linked to the 1-component of A(F1) as in snapshot 3.

Figure 13. Interpretation of a block structured program: formation of a
closure of text G (snapshots 1 to 4).

Snapshot 4 shows the effect of interpreting the declaration of G. This
adds a closure of G as the E·'g'·C-component of L(F1). The meaning of
identifier x, which is an external identifier of G, is fixed in the closure
by making the E·'x'-component of the closure identical with the
E·'x'-component of the current L-structure. Snapshot 4 also shows the
effect of the statement return g, which links the E·'g'-component of L(F1)
as the R-component (result value) of the argument structure A(F1). This
action completes execution of the instructions of F, hence L(F1) is deleted
and execution of instructions of P is resumed. To complete interpretation
of the statement q := apply f(1), the R-component of A(F1) is made the
E·'q'-component of L(P), and A(F1) is deleted. The result is shown in
snapshot 5 (Figure 14), which also shows the effect of interpreting
r := apply f(2) by a similar sequence of events.

The progress of this computation through snapshot 5 illustrates how
values required to interpret external references may be conveyed to a
procedure activation via the argument structure, and how closures of a text
may be formed to fix the meaning of the external (free) identifiers in a
procedure declaration, all without going outside the base language features
we have introduced. The remaining snapshots show what is involved in
applying a closure with a nonempty E-component.

Interpretation of the statement u := apply q(3) begins with formation of
an argument structure A(G1) as in snapshot 6, Figure 14. Here, since
X(G) = {x}, an E·'x'-component of A(G1) is created and made identical with
the E·'x'-component of the closure value of q in L(P). Then the initial
instructions of G identify the E·'y'-component of L(G1) with the
1-component of A(G1), and, since x ∈ X(G), identify the E·'x'-component of
L(G1) with the E·'x'-component of A(G1). Instructions corresponding to the
body of G compute the value t = (1 + 3) ↑ 2 = 16, which is returned as the
R-component of A(G1). The result is snapshot 7, which includes the effect
of interpreting the statements v := apply r(5) and z := u + v.

Figure 14. Interpretation of a block structured program: application of
closures.

With the above example as a guide, we can formulate a set of rules
governing the simulation of block structured programs by the base language
interpreter:

1. Formation of an argument structure for application of a closure
   of text T:

   a. The i-th actual parameter for application of text T is made
      the i-component of A(T).

   b. For each identifier x in X(T), the E·'x'-component of the
      closure of T to be applied is made the E·'x'-component of A(T).

2. Initialization of the local structure L(T):

   a. For each x ∈ X(T), the E·'x'-component of A(T) is made the
      E·'x'-component of L(T).

   b. For each x ∈ B(T), an empty E·'x'-component is appended to L(T).

   c. Each actual parameter value (i-component of A(T)) is made the
      E·'x'-component of L(T), where x is the identifier of the i-th
      formal parameter.

3. Return of values:

   a. Interpretation of the statement return x makes the R-component
      of A(T) identical with the E·'x'-component of L(T).

   b. Interpretation of z := apply f(...) in text T is completed
      by making the E·'z'-component of L(T) identical with the
      R-component of the argument structure for application of the
      closure f.

   c. The argument structure is deleted from L(T).

4. Formation of a closure of text T as the value of identifier f in
   an activation of text T':

   a. The new C-structure is made the E·'f'·C-component of L(T').

   b. The text T is made the T-component of the C-structure.

   c. For each x ∈ X(T), the E·'x'-component of L(T') is made the
      E·'x'-component of the C-structure.

5. Interpretation of a procedure assignment f := g in text T:

   a. The E·'g'·C-component of L(T) is made the E·'f'·C-component
      of L(T), any previously existing C-arc from procedure node
      E·'f' being deleted.
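
Rules 1 through 4 can be exercised on program 1 with a short Python
sketch. It is illustrative only and makes several assumptions: structures
are dictionaries, a shared component is modelled by a shared Cell object, a
procedure variable node and its C-arc are collapsed into a Cell whose value
is the closure, and each text's instructions are stood in for by a Python
function (g_body, f_body). All of these names are hypothetical. Rule 5 is
not covered.

```python
class Cell:
    """A node holding a value; sharing a Cell models a shared component."""

    def __init__(self, value=None):
        self.value = value


def make_closure(text, L):
    # Rule 4: the C-structure holds the text and, for each external
    # identifier, the same cell the enclosing activation uses.
    return {"T": text, "E": {x: L["E"][x] for x in text["X"]}}


def apply_closure(closure, actuals):
    text = closure["T"]
    # Rule 1: numbered parameter components plus an E-component.
    A = {i + 1: Cell(v) for i, v in enumerate(actuals)}
    A["E"] = {x: closure["E"][x] for x in text["X"]}
    # Rule 2: externals from A, empty cells for locals, then parameters.
    L = {"E": dict(A["E"])}
    for x in text["B"]:
        L["E"][x] = Cell()
    for i, x in enumerate(text["params"]):
        L["E"][x] = A[i + 1]
    text["body"](L, A)
    # Rule 3b: the caller takes the R-component; A is then dropped.
    return A["R"].value


def g_body(L, A):
    E = L["E"]
    E["t"].value = (E["x"].value + E["y"].value) ** 2
    A["R"] = E["t"]                       # rule 3a: return t


def f_body(L, A):
    L["E"]["g"].value = make_closure(G, L)
    A["R"] = L["E"]["g"]                  # rule 3a: return g


G = {"B": ["y", "t"], "X": ["x"], "params": ["y"], "body": g_body}
F = {"B": ["x", "g"], "X": [], "params": ["x"], "body": f_body}

f = {"T": F, "E": {}}                     # closure of F; X(F) is empty
q = apply_closure(f, [1])
r = apply_closure(f, [2])
u = apply_closure(q, [3])
v = apply_closure(r, [5])
z = u + v
```

Note that the only route by which x reaches an activation of G is the
chain closure, argument structure, local structure; nothing refers outside
the structures handed to the activation.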

CYCLES IN THE INTERPRETER STATE

The method of simulating block structured programs described above has
a defect with respect to our plans for the base language: interpretation
of programs can lead to interpreter states for which the graph of the
state has directed cycles and is not an object according to our definition.
The simplest case is the following program:

program 2:

f := procedure (x); integer x
    if x = 0 then return 1
    y := g(x)
    z := apply f(y)
    return z

apply f(u)

The snapshot in Figure 15a shows the situation just after interpretation of
the declaration of text F. The cycle arises because of the free occurrence
of f within text F, where the value of f is a closure of F.

To understand in general the conditions under which cycles are introduced,
it is instructive to use diagrams showing all nonlocal references to the
procedure variables of programs being studied. A procedure variable is
uniquely specified by a pair (x, T) where x is a procedure identifier and T
is a text in which x is declared, that is, x ∈ B(T). We write R(x, T) to
represent the range of a procedure variable; R(x, T) is a set containing
each text T' such that the variable (x, T) could be assigned a closure of
T' as its value during program execution. The range of a procedure variable
may be determined by tracing references to, and assignments of, the
closures defined by procedure declarations. We suspect that, unless a
program has redundant or unproductive statements, there will be some
interpretation for its function and predicate symbols such that each
element of the range of a procedure variable occurs as its value in some
computation by the program.

Figure 15. Interpreter state for program 2 showing the cycle introduced,
and the corresponding procvar diagram.

Figure 16. Procvar diagram for program 1 showing absence of necessary
conditions for the occurrence of cycles.

To construct the procedure variable diagram (procvar diagram for short)
of a block structured program, represent the texts of the program by closed
contours nested in the same way as the texts. The area inside the contour
for T but outside contours for texts enclosed by T is the locality of T.

Let (x, T) be a procedure variable of the program, and represent it by a
solid dot labelled x and placed in the locality of T. Place a small open
circle in the locality of T' for each text T' with T' < T in which
identifier x refers to the procedure variable (x, T). Join each of these
circles to the solid dot denoting (x, T) by arcs without arrows. For each
text T' in the range of variable (x, T), draw an arrow from the solid dot
representing (x, T) to the contour for T'. Repeat these steps for each
procedure variable of the program.

The procvar diagram for program 2 is shown in Figure 15b, and the diagram
for program 1 appears in Figure 16.

Next we formulate a necessary condition for a block structured program
to generate cycles when interpreted according to our rules of simulation.
First consider the forms a cycle must have in an interpreter state. There
are nine kinds of nodes involved in the interpretation of block structured
programs:

L:    root nodes of L-structures
L·E:  environment nodes of L-structures
A:    root nodes of A-structures
A·E:  environment nodes of A-structures
S:    simple variable nodes
P:    procedure variable nodes
C:    root nodes of C-structures
C·T:  text nodes of C-structures
C·E:  environment nodes of C-structures

Of these, types L and A cannot occur in cycles because no action by the
interpreter creates any arcs terminating on L-nodes or A-nodes (aside from
the implicit links we have omitted from the diagrams). Further, arcs
terminating on L·E- or A·E-nodes can only emanate from L- and A-nodes,
respectively. Hence these node types cannot occur in cycles. No arcs
emanate from S-nodes, and no arcs from procedure structures terminate on
nodes of L-structures; therefore S-nodes and T-nodes cannot occur in
cycles. These considerations leave just three kinds of arcs that can be
members of any cycle (x is some procedure identifier):

C --E--> C·E        C·E --x--> P        P --C--> C

Thus a cycle in an interpreter state consists of a series of triplets, each
triplet having one of each kind of arc, in the order shown above. From this
reasoning, we deduce that a cycle arises from interpretation of a block
structured program only if there is a finite sequence of texts T₁, T₂, ...,
Tₖ, and a corresponding sequence of identifiers x₁, x₂, ..., xₖ, that meet
these conditions:

1. Each xᵢ is an external procedure identifier of Tᵢ: xᵢ ∈ X(Tᵢ).
   Let (xᵢ, Tᵢ') be the procedure variable denoted by xᵢ in text Tᵢ.
   Note that Tᵢ < Tᵢ'.

2. For each i and with j = (i mod k) + 1, Tⱼ is in the range of
   (xᵢ, Tᵢ').

These conditions imply that the procvar diagram of a program has a
cycle of arrows such that each arrow terminates on the contour of a text
that contains an external reference to the procedure variable from which
the next arrow emanates. For program 2, Figure 15b shows a cycle that
involves just one procedure variable (f, F).
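
Conditions 1 and 2 say, in effect, that a certain directed graph on texts
has a cycle: there is an edge from text Tᵢ to text Tⱼ whenever Tᵢ refers
externally to a procedure variable whose range contains Tⱼ. The Python
below is a sketch of that check under assumptions: the function name and
the two input tables are hypothetical, the tables are taken as given
rather than derived from program text, and procedure variables are labelled
by identifier alone, which presumes identifiers are not redeclared.

```python
def meets_cycle_condition(external_refs, ranges):
    """True if texts T1..Tk and identifiers x1..xk satisfying
    conditions 1 and 2 exist.

    external_refs: text -> procedure variables it references externally
    ranges:        procedure variable -> texts in its range
    """
    succ = {}
    for text, variables in external_refs.items():
        succ.setdefault(text, set())
        for var in variables:
            for target in ranges.get(var, ()):
                succ[text].add(target)
                succ.setdefault(target, set())

    state = {}                          # text -> "active" or "done"

    def visit(text):
        state[text] = "active"
        for nxt in succ[text]:
            if state.get(nxt) == "active":
                return True             # back edge closes a cycle
            if nxt not in state and visit(nxt):
                return True
        state[text] = "done"
        return False

    return any(t not in state and visit(t) for t in list(succ))


# Text F refers externally to f, whose range contains F.
self_reference = meets_cycle_condition({"F": {"f"}}, {"f": {"F"}})

# F refers to g and G refers to f.
mutual_reference = meets_cycle_condition(
    {"F": {"g"}, "G": {"f"}}, {"f": {"F"}, "g": {"G"}})

# Program 1: no text refers externally to a procedure variable.
program_1 = meets_cycle_condition(
    {"P": set(), "F": set(), "G": set()},
    {"f": {"F"}, "g": {"G"}, "q": {"G"}, "r": {"G"}})
```

Because the condition is only necessary, a True result indicates that a
cycle may arise, not that one will.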

Program 3 below is a nest of procedures activated recursively.

The procvar diagram for this program is shown in Figure 17a, and Figure 17b
illustrates the interpreter state resulting from simulation through the
first activation of text G. Still the cycle only involves procedure
variable (f, P) because the only external reference is the appearance of f
in text G.

The  sort  of  program  that  leads  to  more  complex  cycles  is  illustrated  by 


) 

) 


)  ...  end 

)  ...  end 


Figure  18  gives  the  procvar  diagram  for  program  4  and  shows  the  state  of  the 
interpreter  after  the  declarations  of  F  and  G  have  been  executed.  The  cycle 
involves  procedure  variables  (f,  P)  and  (g,  P). 

We  have  found  that  many  block  structured  programs  can  be  rewritten  so 
they  accomplish  the  original  computation  but  no  longer  satisfy  the  necessary 
condition  for  the  creation  of  cycles.  The  principle  is  to  convey  closures 
to  and  from  a  procedure  activation  by  passing  them  as  parameters  or  results 
rather  than  by  external  references.  In  this  way,  the  three  example  programs 
may  be  rewritten  as  the  three  transformed  programs  given  below.  In  each  case 
the  texts  of  the  transformed  programs  do  not  contain  any  external  references 
to  procedure  variables  and  therefore  cannot  lead  to  cycles  when  performed  by 
the  interpreter  we  have  described. 


apply f(f, u)


procedure ( )

    f := procedure (h, ); proced h
    begin
        g := procedure (k, );
        begin ... apply k(k, ) ...
        apply g(h, )
    end

    apply f(f, )


procedure ( )

    f := procedure (h, k, )
    begin ... apply k(h, k, ) ... end

    g := procedure (h, k, )
    begin ... apply h(h, k, ) ... end

    apply f(f, g, )
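
The effect of the rewriting can be observed in any language whose closures
can be inspected. The Python sketch below is an analogy under assumptions:
the function names are hypothetical and the factorial body is invented,
since the programs above leave their functions unspecified. It contrasts a
recursive procedure that reaches itself by external reference with one
that receives itself as a parameter.

```python
def by_external_reference():
    def f(x):
        # f is a free identifier here, so the closure of f refers to f.
        return 1 if x == 0 else x * f(x - 1)
    return f


def by_parameter():
    def f(h, x):
        # The procedure to call arrives as the parameter h.
        return 1 if x == 0 else x * h(h, x - 1)
    return f


cyclic = by_external_reference()
acyclic = by_parameter()
result = acyclic(acyclic, 5)

# The first closure's environment contains the closure itself.
environment_points_back = cyclic.__closure__[0].cell_contents is cyclic
# The second has no external identifiers, hence no environment at all.
has_no_environment = acyclic.__closure__ is None
```

Both versions compute the same function; only the first builds a structure
that points back at itself.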


Several interesting questions are unresolved at this writing. We do
not know in what sense, if any, the necessary condition formulated above
is a sufficient condition for the formation of cycles during interpretation
according to the scheme outlined. Also, we do not know a general method
for rewriting block structured programs so that cycles will not arise
during execution.


REPRESENTATION OF CONCURRENCY IN THE BASE LANGUAGE

A subject of major importance in the design of the base language is the
representation of concurrent activities. In the introduction we noted that
some computations inherently involve concurrent processes and cannot be
simulated by sequential programs; also, that a high degree of concurrency
within computations may prove essential to the practical realization of
computer systems with programming generality. To these motivations we may
add that some contemporary source languages, notably PL/I, have explicit
provision for programming concurrent processes.

We  regard  the  state  transitions  of  the  interpreter  as  representing  the 
progress  of  all  activities  in  a  computer  system  that  is  executing  many 
programs  simultaneously.  The  basic  requirements  for  representing  concurrent 
actions  in  the  interpreter  are  met  by  providing  for  many  sites  of  activity 
in  the  control  component  of  the  state  (Figure  3),  and  by  organizing  the 
local  structures  of  procedure  activations  as  a  tree  so  a  procedure  may 
spawn  independent,  concurrent  activations  of  component  procedures.  Multiple 
sites  of  activity  may  represent  many  actions  required  to  accomplish  different 
parts  of  one  computation  as  well  as  parallel  execution  of  many  independent 
computations . 

Consideration of concurrent computation brings in the issue of
nondeterminacy: the possibility that computed results will depend on the
relative timing with which the concurrent activities are carried forward.
The work of Van Horn [27], Rodriguez [22] and others has shown that computer
systems can be designed so that parallelism in computations may be realized
while determinacy is guaranteed for any program written for the system. The
ability of a computer user to direct the system to carry out computations
with a guarantee of determinacy is very important. Most programs are
intended to implement a functional dependence of results on inputs, and
determinism is essential to the verification of their correctness.

There  are  two  ways  of  providing  a  guarantee  of  determinacy  to  the  user 
of  a  computer  system.  The  distinction  is  whether  the  class  of  abstract  or 
base  language  programs  is  constrained  by  the  design  of  the  interpreter  to 
describe  only  determinate  computations.  If  this  is  the  case,  then  any 
abstract  program  resulting  from  compilation  will  be  determinate  in  execution. 
Furthermore,  if  the  compiler  is  itself  a  determinate  procedure,  then  each 
translatable  source  program  represents  a  determinate  procedure.  On  the 
other  hand,  if  the  design  of  the  interpreter  does  not  guarantee  determinacy 
of  abstract  programs,  determinacy  of  source  programs,  when  desired,  must  be 
ensured  by  the  translator. 

In the base language, it is necessary to provide for computations that
are inherently nondeterminate, such as the example of a process awaiting
the first response from either of two terminals. We want to include in the
base language primitive features for representing essential forms of
nondeterminacy. In principle, we wish to guarantee that any (base language)
procedure that does not use these features will be determinate in its
operation. Furthermore, use of base language primitives for the
construction of nondeterminate procedures is intended to be such that the
choice among alternative outcomes always originates from the source
intended by the program author, and never from timing relationships
unrelated to his computation.

Our current thoughts regarding representation of base language procedures
so as to guarantee determinacy are based on data flow representations for
programs in which each operation is activated by the arrival of its
operands, and each result is transmitted, as soon as it is ready, to those
operations requiring its use. Rodriguez [22] has formulated a data flow
model that applies to programs involving assignment, conditional, and
iteration statements, and data represented by simple variables. Procedures
represented by Rodriguez program graphs are naturally parallel and the
rules for their execution guarantee determinacy. In [3], Dennis has given a
similar program graph model for procedures that transform data structures,
but do not involve conditional or iteration steps. Determinacy is
guaranteed for these program graphs if they satisfy a readily testable
condition.
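
The firing discipline described here, in which an operation runs once all
of its operands have arrived, can be illustrated by a toy evaluator. The
Python below is a sketch under assumptions and is not the Rodriguez or
Dennis model: run_dataflow and its graph encoding are hypothetical, each
operation fires exactly once, and there are no conditional or iteration
nodes.

```python
import operator


def run_dataflow(nodes, inputs):
    """Fire each operation once all of its operands have arrived.

    nodes:  name -> (function, list of operand names)
    inputs: name -> initial value
    """
    values = dict(inputs)
    pending = dict(nodes)
    firing_order = []
    while pending:
        ready = [name for name, (_, operands) in pending.items()
                 if all(o in values for o in operands)]
        if not ready:
            raise ValueError("some operands never arrive")
        for name in ready:          # these firings do not affect each other
            function, operands = pending.pop(name)
            values[name] = function(*(values[o] for o in operands))
            firing_order.append(name)
    return values, firing_order


# z = (x + y) * (x - y); the sum and difference may fire in either order.
graph = {
    "s": (operator.add, ["x", "y"]),
    "d": (operator.sub, ["x", "y"]),
    "z": (operator.mul, ["s", "d"]),
}
values, order = run_dataflow(graph, {"x": 5, "y": 3})

# Presenting the operations in the opposite order changes the firing
# sequence but not the result.
reordered = dict(reversed(list(graph.items())))
values2, order2 = run_dataflow(reordered, {"x": 5, "y": 3})
```

In this toy the result does not depend on the order in which ready
operations are fired, which is the property the text calls determinacy.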

We hope to be successful in combining and extending these two models
to obtain a satisfactory data flow model for all determinate procedures.
If this objective can be achieved, we expect to use program graphs as the
nucleus of the base language. On the basis of improved understanding of
parallel programs obtained by recent research on program schemes by Karp
and Miller [11], Paterson [21], Slutz [23], and Keller [12], we are
optimistic about finding an inherently determinate scheme for representing
the concurrency present in most algorithms.


CONCLUSION 


This article has been an introduction to the goals, philosophy and
methods of our current work on the design of a base language. The material
presented is an "instantaneous description" of an activity that still has
far to go; many issues need to be satisfactorily resolved before we will
be pleased with our effort. In addition to the representation of
concurrency, the base language must encompass certain concepts and
capabilities beyond those normally provided in contemporary source
languages. Four aspects of this kind are: 1. Generation and transformation
of information structures that share component structures; 2. Concurrent
processes that, in pairs, have producer-consumer relationships;
3. Programming systems that are able to generate base language programs and
monitor their execution; and 4. Provision for controlling and sharing
access to procedures and data structures among users of a computer system.
We are continuing investigation of how these capabilities should be
incorporated in the base language. Some ideas on intercommunicating
processes have been reported briefly [5]. Some thoughts on program
monitoring and controlled sharing of information are given by Dennis and
Van Horn [6], and by Vanderbilt [25].